Learning The Meaning And Usage Of Time Phrases From A Parallel Text-Data Corpus
Abstract
We present an empirical corpus study of the meaning and usage of time phrases in weather forecasts; this is based on a novel corpus analysis technique where we align phrases from the forecast text with data extracted from a numerical weather simulation. Previous papers have summarised this analysis and discussed the substantial variations we discovered among individual writers, which was perhaps our most surprising finding. In this paper we describe our analysis procedure and results in considerably more detail, and also discuss our current work on using parallel text-data corpora to learn the meanings of other types of words.
Similar resources
Corpus based coreference resolution for Farsi text
"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...
A new text-mining method for extracting user context information to improve the ranking of search engine results
Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual and documentary material increases day by day, so we need useful ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo, and Bing need to read a great many web documents and retrieve the most similar ones to the user ...
Corpora and Cognition: The Semantic Composition of Adjectives and Nouns in the Human Brain
The action of reading, understanding and combining words to create meaningful phrases comes naturally to most people. Still, the processes that govern semantic composition in the human brain are not well understood. In this thesis, we explore semantics (word meaning) and semantic composition (combining the meaning of multiple words) using two data sources: a large text corpus, and brain recordi...
Iterative Learning of Parallel Lexicons and Phrases from Non-Parallel Corpora
While parallel corpora are an indispensable resource for data-driven multilingual natural language processing tasks such as machine translation, they are limited in quantity, quality and coverage. As a result, learning translation models from nonparallel corpora has become increasingly important nowadays, especially for low-resource languages. In this work, we propose a joint model for iterativ...
Corpus-Driven Study of Translation Units in an English-Chinese Parallel Corpus
It is widely acknowledged that texts are not translated word by word, but unit by unit. Single words are polysemous and therefore ambiguous in translation. Corpus linguistics, in monolingual context, has replaced the traditional basic notion of meaning (words) with the extended unit of meaning. Accordingly, this paper argues that in bilingual context, the translation unit, as the counterpart co...
Publication date: 2003